(54) Method and apparatus for encoding video fields



(57) The present invention relates to a method and
apparatus for encoding video sequences of a field repetition
rate of 50 Hz and 60 Hz. An improved accuracy
of film mode detection is achieved by taking an additional
image portion of the fields into account when detecting
motion between fields of the video sequence. This
results in an increased efficiency of the encoding procedure.
Description

[0001] The present invention relates to a method and an apparatus for encoding a sequence of video fields. 
[0002] Image data compression aims to reduce the amount of data to be recorded or transmitted. A conventional, 
uncompressed serial digital television signal has a data rate of 270 Mbit/s. Such an amount of data can neither be
transmitted to each television receiver by normal broadcasting service nor recorded on a storage medium. Because 
of limited bandwidth, memory and computational resources, video data has to be encoded (compressed) in order to 
increase the efficiency of communication systems. 

[0003] Two categories of data compression algorithms can be distinguished, namely lossless and "lossy" algorithms. 

Lossy techniques cause image quality degradation in each compression/decompression step. Careful consideration
of the human visual perception ensures that the degradation is often unrecognisable, although this depends on the 
selected compression ratio. Lossy techniques provide far greater compression ratios than lossless techniques. 
[0004] Lossless encoding guarantees that the decompressed image is absolutely identical to the image before compression.
This is an important requirement for some application domains. Examples of such lossless techniques are
run length encoding, Huffman encoding, and entropy coding.

[0005] In lossy coding techniques a combined operation is responsible for the compression. A first step is aimed at 
the exploitation of statistical characteristics of an image, in particular of a high correlation of image data and redundancy. 
For this purpose, transform coding methods are employed. These transformations convert a spatial representation of 
two dimensional video data into the frequency domain. Examples of such transformations are the Fourier Transform,
the Discrete Cosine Transform (DCT) and the Karhunen-Loève Transform. These transformations result in a representation
having a reduced number of coefficients which contain most of the essential video information by decorrelating
the original signal. The decorrelation generally results in the signal energy being redistributed among only a small set 
of transform coefficients. Usually, such transformations do not cause any loss of information. 
[0006] In a second step, the obtained transform coefficients are quantized to reduce the amount of information in
the frequency domain. Quantization is usually a non-reversible operation. In this way, many coefficients may be
discarded after quantization. 

[0007] A general transform coding scheme involves subdividing an image into smaller blocks and performing a unitary 
transform on each sub image. 

[0008] In recent years, a number of standardization processes based on transform coding have led to a number
of standards such as JPEG and MPEG. These standards address different kinds of source video data and levels of image
quality. Such source video data might be still pictures or video sequences of progressive or interlaced images. The levels
of image quality range from VHS quality (MPEG-1) to high-resolution HDTV.

[0009] Video signals received by a television receiver are generally in interlaced mode. In interlaced mode, only 
fields are transmitted to the receiver. Each field has half the number of lines compared with a complete frame, and 
succeeding fields comprise alternate lines of a frame, wherein a first field generally comprises lines having an odd line
number and the succeeding field comprises lines having an even line number with respect to the lines of a corresponding
frame. 

[0010] A television receiver receives fifty fields per second in compliance with the PAL television standard and sixty fields
per second in conformity with the NTSC television standard. Consequently, the PAL field repetition rate is 50Hz and the NTSC
field repetition rate is 60Hz. Such a repetition rate reduces large area flicker when compared with a 25Hz/30Hz repetition
rate of frames. The PAL and NTSC repetition rates correspond to a new field every 20ms (PAL) or 16.6ms (NTSC).
[0011] When a video camera is used as a video source, movements of a filmed object result in different positions of
said object in consecutive fields. A re-composition of such two fields into a single frame would result in objectionable 
errors, such as saw tooth artefacts, in the reproduced picture as both fields relate to different motion phases. 

[0012] Motion picture data is composed of complete frames only. Such motion picture information with a frame rate
of twenty-four frames per second is converted into an interlaced video format, using a commonly known pulldown 
technique. 

[0013] A 2-2 pulldown technique is employed in order to convert motion picture film into an interlaced PAL video 
signal. The 2-2 pulldown technique generates two fields out of each frame, which is repeated twice. In the 2-2 pulldown
technique, two succeeding fields contain information originating from the same frame. To overcome the disparity in
frequency, the film is played slightly faster at 25 instead of the original 24 frames per second. 
[0014] When converting motion picture data into NTSC video signals, film data with a rate of 24 frames per second 
is converted into a 60 field per second video signal using a 3-2 pulldown technique. The 3-2 pulldown technique gen- 
erates two video fields for a given film frame and three video fields for the next film frame. 
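The two pulldown schemes described above can be illustrated with a small sketch. It is not taken from the patent: the function name `pulldown`, the `cadence` tuple and the "A"/"B" parity labels are assumptions chosen for illustration. The sketch only shows how many fields each film frame contributes and that 24 frames yield 60 fields under a 2-3 cadence, while 25 frames yield 50 fields under a 2-2 cadence.

```python
from itertools import cycle


def pulldown(num_frames, cadence):
    """Return one (frame_index, parity) tuple per generated video field.

    cadence lists how many fields successive film frames contribute,
    e.g. (2, 2) for 2-2 pulldown or (2, 3) for 3-2 pulldown.
    Parity simply alternates between "A" and "B" from field to field
    (hypothetical labels for odd-line and even-line fields).
    """
    fields = []
    fields_per_frame = cycle(cadence)
    for frame in range(num_frames):
        for _ in range(next(fields_per_frame)):
            parity = "A" if len(fields) % 2 == 0 else "B"
            fields.append((frame, parity))
    return fields


pal_fields = pulldown(25, (2, 2))   # one second of film played at 25 frames/s
ntsc_fields = pulldown(24, (2, 3))  # one second of film at 24 frames/s
print(len(pal_fields), len(ntsc_fields))
print(ntsc_fields[:5])
```

In this sketch the third field taken from a frame under the 2-3 cadence has the same parity as the first one, which is why consecutive frames start on alternating parities.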

[0015] In order to improve the efficiency of encoding methods it is desirable to base an encoding procedure of interlaced
video signals on frames instead of fields. A frame based encoding enables the reduction of redundant information
of two consecutive fields to a minimum of relevant image information. For this purpose, consecutive fields may be
reassembled to frames. Due to the time offset between consecutive fields, a reassembling is not possible in case the
image content includes moving objects. A frame originating from two such fields reduces the coding efficiency. The field
data need not be reassembled as a whole but may be reassembled per sub-block prior to being subjected to an
encoding procedure.

[0016] A re-composition of succeeding fields into frames without a loss of encoding efficiency and image quality
degradation is only possible if a video sequence can reliably be determined to originate from motion picture data and the
video fields which correspond to a common film frame are identified. As no special information is included in video
signals for indicating whether or not a field originates from motion picture data, a film mode detection is employed
before a re-composition of fields is carried out. Film mode detection distinguishes whether input video signals originate
from motion picture data or e.g. from a video camera.
[0017] In case the source of film conversion is an animated or computer generated scene, every pulldown mode is
possible, e.g. 3-4, 4-4, 4-5 etc. It depends largely on the quality of the scene and the number of source frames available.
However, there is no standard guiding such conversions. It is a beneficial feature if the film mode detection in an
encoding apparatus can be set to detect virtually any pulldown.

[0018] Document EP-A2-0 720 366 describes an apparatus and method for film mode detection. The method addresses
a detection of patterns representative of 2-2 pulldown and 3-2 pulldown film originated frames. A film data
comparison and accumulation unit calculates differences between pixels of a current field and vertical adjacent pixels 
of a previous field. The calculated differences are accumulated per field. Such field difference information is evaluated 
in a film data reduction unit by determining changes in the field difference information signal on successive fields. The 
changes in the field difference are compared to expected film-generated patterns using correlation techniques. For 
this purpose, a delayed field difference signal is subtracted from a current field difference signal. A one-bit per field
sequence formed by the sign-bit of the resulting difference signal is compared with the predetermined film-generated 
patterns. 

[0019] A further conventional film mode detector for 50Hz television signals is known from WO 94/30006. The described
apparatus determines whether a video source is a motion picture film and identifies which two fields are from
the same film frame. The described detection scheme looks for differences in motion between consecutive fields. The
motion signal, consisting of ones for "motion" and zeros for "no motion", is applied to a sequence detector which looks 
for a "10" pattern in the motion signal. A detection of nine successive sequences of film mode pattern "10" causes the 
detector to enter film mode. A return to camera mode (called video mode) is performed when a detected pattern is 
inconsistent with film mode, namely a pattern of two ones. In case no motion is detected (corresponding to a pattern
of continuous zeros) the present mode, either film or camera mode, is maintained.
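The behaviour of the sequence detector summarised in the preceding paragraph can be sketched as a small state machine. This is an illustrative reading of the description only, not the implementation of WO 94/30006: the function name `detect_mode`, the way "10" occurrences are counted, and the reset of the counter on a "11" pattern are assumptions.

```python
def detect_mode(motion_bits, required=9):
    """Return "film" or "camera" after processing a sequence of motion bits.

    Assumed rules, following the textual description:
      * every "1" followed by "0" counts as one film mode pattern,
      * `required` such patterns in a row switch the detector to film mode,
      * a "1" followed by "1" is inconsistent with film mode and switches
        back to camera mode (and, as an assumption, resets the counter),
      * runs of zeros leave the current mode unchanged.
    """
    mode = "camera"
    count = 0
    previous = None
    for bit in motion_bits:
        if previous == 1 and bit == 0:
            count += 1
            if count >= required:
                mode = "film"
        elif previous == 1 and bit == 1:
            count = 0
            mode = "camera"
        previous = bit
    return mode


print(detect_mode([1, 0] * 9))
print(detect_mode([1, 0] * 9 + [1, 1]))
```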

[0020] Video information of an interlaced television signal may originate from motion picture film or from a video 
camera source. In an increasing number of situations an interlaced video signal comprises video information of both 
sources in a single field, namely a main portion in film mode and a smaller portion in camera mode. Such fields regularly 
occur when television signals originating from motion picture film comprise inserted/overlaid information originating
from a video camera source. Such additional information may be a text banner inserted in a bottom area of the video
image. Said text banner or ticker usually displays service information provided by the broadcaster. Inserted service
information may relate to the displayed image content, to a later program or to particular news information such as stock
exchange rates.

[0021] The inserted ticker may be at different positions within the field, depending on a broadcaster's or country-wide
convention.

[0022] The aforementioned mixed mode fields usually result from assembling video information obtained from different
sources. In particular, new coding schemes such as MPEG-4 allow for easy combination of image data originating
from different sources within a single re-assembled image. Thus, a single field may comprise data originating from 
motion picture film, from a video camera source and/or from computer generated scenes. 
[0023] In most fields in "mixed" mode almost all image information originates from image information in a first mode
wherein only a small proportion results from data in a second mode (e.g. ticker insertion). Conventionally only the 
predominant first mode will be detected. Thus, the small proportion of image information in the second mode is disre- 
garded. 

[0024] It is the object of the present invention to provide an encoding method and encoding apparatus for a sequence 
of video fields with increased encoding efficiency.

[0025] This is achieved by the teaching of claim 1 directed to an encoding method and by the teaching of claim 14 
for an encoding apparatus. 

[0026] According to the present invention, redundant image information is reduced by encoding video fields on a 
frame data basis when video data of consecutive fields originating from the same frame are reassembled to frame 
based data. The determination of film mode within a sequence of fields is based on a film mode detection. In order to
detect film mode, pixel differences are calculated for pixels at corresponding pixel positions of consecutive fields. The 
pixel differences are accumulated for at least two different predefined portions of a field. Based on the accumulated 
differences, it is determined whether motion is present between two consecutive fields. This motion information from 
a sequence of fields is used to detect a predetermined pattern within the motion information.
[0027] Based on the use of two image portions in order to decide whether motion is present between two consecutive 
fields, film mode detection is performed more accurately and, consequently, the encoding of interlaced video sequences 
is more efficient. 

[0028] Different approaches are possible for reassembling two fields to a single frame. First, both fields may be combined
as a whole or, second, both fields are split up into blocks during the encoding procedure and corresponding
blocks of both fields are reassembled representing frame based video data. 

[0029] According to a further aspect of the invention, the second portion of a field for accumulating pixel differences 
is adjustable in size and/or position within a field in order to select a particular image region for motion detection. Such
an adaptation of the second portion in size and/or position enables even more accurate motion detection and, thus,
improved film mode detection and encoding efficiency.

[0030] Preferably, motion detection is carried out by comparing the field differences accumulated for two consecutive 
fields with a threshold value. Such a motion detection does not require any complex hardware implementation. 
[0031] In a more preferred embodiment, the threshold value for a current field difference value is the preceding field 
difference value. The use of preceding field difference values enables an adaptation of the motion detection to the
image content. Consequently, a more accurate motion detection and an increased encoding efficiency may be 
achieved. 

[0032] Preferably, the field differences accumulated for a first and second portion of a field are not compared separately
with a threshold value, but are accumulated and the sum is applied to the comparing step. With such a combination,
both field difference values are taken into account by a procedure simple to realise.

[0033] In a preferred embodiment, the second field difference value is multiplied by a weighting factor prior to accumulation
with the first field difference value. Such a weighting procedure allows the impact of the second
portion on the motion detection result to be emphasised. Depending on the application and image content, a second portion
(or further additional portions, each of which may be weighted differently) may be taken into account in order to yield an improved
film mode detection for the encoding procedure.

[0034] Preferably, the threshold value (in particular the preceding field difference) is weighted by a factor larger than
1. Such a weighting allows the threshold to be adapted to a particular image content and avoids a less accurate motion
indication.

[0035] For film mode detection, a determined motion indicator sequence is preferably compared to pulldown patterns 
generated when converting motion picture film data into video field sequences. In most cases, a pulldown pattern for
a conversion into PAL or NTSC video signals is employed. 

[0036] According to a further aspect of the present invention, a film mode indication is not terminated as soon as the 
predefined pattern is not detected within the sequence of motion indicators. Preferably, a film mode indication is 
switched to a camera mode indication after said predefined pattern is not detected for a predetermined number of 
times. In order to avoid sudden changes of the resulting image quality, short film mode interruptions are suppressed
and a continuous film mode is assumed. 
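The suppression of short film mode interruptions can be sketched as a simple hysteresis. The sketch is an assumption-laden illustration, not the claimed method: the name `mode_sequence`, the parameter `max_misses` and the rule that a single detected pattern is enough to enter film mode are all hypothetical.

```python
def mode_sequence(pattern_found, max_misses=3):
    """Return the mode indication after each pattern matching result.

    pattern_found: iterable of booleans, True when the predefined pattern
    was detected in the motion indicator sequence.
    Film mode is only left after `max_misses` consecutive misses
    (hypothetical parameter standing in for the "predetermined number").
    """
    mode = "camera"
    misses = 0
    modes = []
    for found in pattern_found:
        if found:
            mode = "film"
            misses = 0
        elif mode == "film":
            misses += 1
            if misses >= max_misses:
                mode = "camera"
        modes.append(mode)
    return modes


print(mode_sequence([True, False, False, True, False, False, False]))
```

With these assumed settings, the two misses in the middle of the example are bridged and film mode is only dropped after the third consecutive miss at the end.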

[0037] Further advantageous embodiments of the present invention are the subject-matter of dependent claims. 
[0038] The invention will be further described with reference to the accompanying drawings, in which: 

Fig. 1 represents a schematic block diagram for encoding video data.

Fig. 2 represents a schematic block diagram of an encoding apparatus of video data further taking similarities
between encoding images into account.

Fig. 3a, 3b show timings of film frames converted to interlaced television signals according to PAL and NTSC
standard, and the same signal delayed for a period of one field.

Fig. 4 shows a top-level block diagram of a film mode detection unit in an encoder according to the present invention.

Fig. 5 is a simplified block diagram showing details of an embodiment of a field difference unit (201).

Fig. 6 is a simplified block diagram showing details of an embodiment of a sequence register (202).

Fig. 7 is a flow-chart showing steps of a pattern matching algorithm, as embodied by the pattern matching unit (203).

Fig. 8 shows examples of a pattern matching for PAL television signals.

Fig. 9 shows examples of a pattern matching for NTSC television signals.






EP 1 198 139 A1 



Fig. 10 shows an example of an image with a ticker insertion. 

[0039] Fig. 1 represents a schematic block diagram of a conventional encoding apparatus for video data. First, an
input terminal receives the video data to be encoded. Each field or frame of video data is divided into block units of a
predetermined size. Each block of data is orthogonally transformed so that the video data are transformed into transform
coefficients in the frequency domain. A well known transformation method is the Discrete Cosine Transform (DCT). A
quantiser changes the obtained transform coefficients through a predetermined quantising process into representative
values of various levels. In order to avoid an overflow or an underflow of transmission data, the quantising procedure
is performed depending on a quantisation level set from a buffer stage. The quantised coefficients are variable
length coded, taking statistical characteristics of the quantised coefficients into consideration. The buffer delivers the
compressed data from the variable length coder (VLC), usually to a transmission channel at a constant transmission rate
or to a recording device. Recording and transmission may be carried out with a constant or a variable bit rate.
[0040] The encoding efficiency may be increased when considering that there are many similar patterns between 
adjacent frames and fields in the video data. Accordingly, in case of slight motion of an image or an image portion,
motion may be estimated by comparing a present field or frame with a previous field or frame. A motion vector is
calculated as a result of such motion estimation. Motion compensation is achieved from previous frames or fields with 
such a motion vector. A feedback loop chain for performing the motion estimation and compensation is shown in fig. 
2. In order to further reduce the data to be encoded, a DPCM encoder comprises a feedback loop chain for performing 
decoding and image prediction based on detected motion vectors. 

[0041] Interlaced video data may be encoded separately per field or by reassembling fields to frames. In order to
avoid an increased encoding effort, fields are encoded separately in case of motion. When transmitting such image 
information through a transmission channel of a constant transmission rate, the image quality is increased only during 
a frame based encoding. 

[0042] Each switching between the two coding modes, namely frame based encoding and field based encoding,
may be perceivable by a viewer. Thus, a frequent change of the encoding modes should be avoided.

[0043] As many video data are based on motion picture film converted into interlaced video signals, such video 
signals may regularly be reassembled and encoded on a frame basis. In order to reliably detect video fields originating
from motion picture frames, a film mode detector is employed, as is well known in the art.

[0044] Such film mode detectors detect particular patterns of motion in a sequence of video fields to be encoded. 
Figures 3 to 10 show how to detect film mode from a sequence of interlaced video fields and how to improve the encoding
of video fields. 

[0045] The top graphs in fig. 3a and 3b show a time scale wherein each division corresponds to the time period for 
one field of a corresponding television signal. Each time division in fig. 3a corresponds to 20ms and in fig. 3b to 16.6ms.
The second graphs in fig. 3a and 3b give a corresponding time interval of a motion picture film frame having numbers
0, 1, 2, 3... For generating a PAL television signal, a motion picture frame rate of 25 frames per second is used, whereas
for an NTSC television signal a motion picture frame rate of 24 frames per second is used.

[0046] The third graphs in fig. 3a and 3b show a television signal F0 comprising video fields. Each field of odd lines
is marked with an "A" and each field of even lines with a "B". The film frame from which each field originates is indicated
by the lower indexes. For instance, video fields A0 and B0 are based on film frame "0".

[0047] Video signal F1 represents a television signal which corresponds to F0, wherein each field is delayed by
a time interval of one field. Correspondingly, a video signal F2 would indicate a delay of two fields with respect to F0.
[0048] The last graphs in fig. 3a and 3b show whether an overall difference between F1 and F0 results in a low (L) or
high (H) difference value. Each difference value L or H represents a low or high accumulated difference between the
above depicted fields, respectively. As can be seen from figures 3a and 3b, a difference between fields of the same
index (originating from the same frame) results in a low difference level L, and differences between fields of different
index numbers (originating from different frames) result in a high difference level H.

[0049] When comparing the sequences of difference levels in figures 3a and 3b, it can be seen that the difference level
patterns result from the different pulldown techniques employed, either 2-2 or 3-2 pulldown.
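The relation between the pulldown cadence and the resulting L/H pattern can be reproduced with a short sketch. It idealises the situation by assuming that fields from the same film frame always give a low difference and fields from different frames always give a high one; the function name `difference_pattern` and the example frame index lists are illustrative assumptions.

```python
def difference_pattern(frame_indices):
    """Return "L"/"H" per pair of consecutive fields.

    frame_indices holds, for each field, the number of the film frame it
    originates from. "L" marks two fields from the same frame, "H" marks
    fields from different frames (idealised, noise-free assumption).
    """
    pairs = zip(frame_indices, frame_indices[1:])
    return "".join("L" if a == b else "H" for a, b in pairs)


pal_source = [0, 0, 1, 1, 2, 2, 3, 3]         # 2-2 pulldown
ntsc_source = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3]  # 2-3 cadence of 3-2 pulldown
print(difference_pattern(pal_source))   # LHLHLHL
print(difference_pattern(ntsc_source))  # LHLLHLHLL
```

The 2-2 sequence alternates strictly, whereas the 3-2 sequence repeats with a period of five fields.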

[0050] A configuration of a film mode detection unit in an encoder according to the present invention is shown in fig. 
4. The film mode detector 205 comprises three main components, namely a field difference unit 201, a sequence register
202 and a pattern matching unit 203. Further, the film mode detector comprises a mode switch 204 in order to output 
a particular mode, either film mode or camera mode. 

[0051] The film mode detector 205 further outputs a "phase" signal indicating which two of the fields F0, F1, F2 were
generated from the same film frame. This information may alternatively be obtained from a sequence of motion indicators
at a later stage.

[0052] The input video signal Y is supplied to a raster position correction unit 206. Vertically adjacent pixels of con- 
secutive fields are adjusted to have corresponding pixel positions, providing raster neutral and thus directly comparable 
fields F0.
[0053] The video signal F0 is delayed by a field delay memory 207 providing a video signal F1. Both video signals
F0, F1 are supplied to a field difference unit 201 of said film mode detection apparatus 205. The field difference unit
201 calculates absolute differences between pixels at corresponding horizontal and vertical pixel positions of two consecutive
fields F0, F1.

[0054] The absolute difference values are accumulated for all pixels of a field. The resulting field difference value FD01
(between a current field of video signal F0 and video signal F1), and the stored field difference value FD12 (between
video signals F1 and F2) are supplied to a sequence register 202. Based on the calculated field differences FD the field
difference unit 201 detects motion between consecutive fields and outputs a motion indicative phase bit for each tran- 
sition between two fields. 

[0055] According to the present invention, the field difference values FD01, FD12 comprise difference values accumulated
for two different portions of pixels of a field, respectively having first accumulated difference values FD01A,
FD12A for a first portion of pixels and second accumulated difference values FD01B, FD12B for a second portion of
pixels within a field. Additional field difference values may be calculated for further portions within a field and handled
respectively. 

[0056] Sequence register 202 receives the four field differences FD and detects whether motion is present between
succeeding fields based on both field differences FDA and FDB for each field. A corresponding motion bit is generated
and stored in the register. Preferably, said sequence register is configured to store 40 motion bits. 
[0057] The stored motion bits are supplied to a pattern matching unit 203 for determining whether or not the input 
video signal originates from motion picture frames. Depending on the detected patterns, a mode switch 204 outputs
a "mode" signal indicating "film mode" or "camera mode".

[0058] The field difference unit 201 will be described in more detail with reference to figure 5. The field difference 
unit 201 simultaneously receives video data F0, F1 representing two consecutive video fields. For a more accurate
result, the processing of the field difference unit 201 is based on luminance information represented by an 8 bit data 
word per pixel. 

[0059] In a different implementation the luminance and chrominance information might well have a data width of
more than eight bits. In this case all following data processing is understood with a broader data path. Thus it can also
be handled by this invention. 

[0060] Based on the pixels P0, P1, which are in the same spatial position n and belong to corresponding fields F0,
F1, the absolute pixel difference values APD are calculated in step 301. Each absolute pixel difference APDn again
has a data width of eight bits. In order to emphasise large differences only and to limit the computational effort, the value of
each APD may be shifted in step 302 by a shifting value SPD to the right in order to reduce the amount of data to be
handled and to reduce the influence of noise on the determination result. It has turned out to be most effective to shift
by a maximum of three bits. This operation is indicated by equation (1).

\[ APD_n = |P_{n,0} - P_{n,1}| \gg SPD \qquad (1) \]
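Equation (1) can be sketched in a few lines. The representation of a field as a list of rows of 8 bit luminance values and the function name `absolute_pixel_differences` are assumptions made for illustration; the limit of three bits for SPD follows the text above.

```python
def absolute_pixel_differences(field0, field1, spd=0):
    """Compute APD_n = |P_n0 - P_n1| >> SPD for every pixel position n.

    field0, field1: equally sized lists of rows of 8 bit luminance values
    (hypothetical data layout). spd: right shift of 0 to 3 bits.
    """
    if not 0 <= spd <= 3:
        raise ValueError("SPD is expected to be between 0 and 3 bits")
    return [
        [abs(p0 - p1) >> spd for p0, p1 in zip(row0, row1)]
        for row0, row1 in zip(field0, field1)
    ]


f0 = [[10, 200], [0, 255]]
f1 = [[12, 100], [0, 0]]
print(absolute_pixel_differences(f0, f1))         # [[2, 100], [0, 255]]
print(absolute_pixel_differences(f0, f1, spd=3))  # [[0, 12], [0, 31]]
```

With a shift of three bits the small difference of 2 disappears entirely, which illustrates how the shift suppresses noise while large differences survive.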

[0061] The absolute differences APD are accumulated for at least two different portions A and B of pixels within a 
field, separately. Such a separate processing is indicated by separate processing paths in fig. 5 branching at decision
step 303.

[0062] Preferably, said first image portion A comprises essentially all active pixels of a field. The second and further 
additional image portions B only comprise a particular, considerably smaller image portion for emphasizing motion 
detection between two fields on that smaller image portion. As will be explained in more detail below, the sensitivity 
for motion within said second portion B might be further increased by increasing the impact of said second portion on
the motion detection procedure.

[0063] Depending on the particular area predefined for each portion A and B of the fields, the respective calculated 
absolute differences are accumulated in steps 304A, 304B separately. For the first portion A absolute pixel differences 
are accumulated in step 304A preferably for all pixels of a field or a particular portion having an aspect ratio of 16:9. 
The field difference unit 201 may switch between both areas for the first portion A depending on a particular flag FMDR. 

[0064] Pixels of a particular image portion B are accumulated in step 304B depending on an area definition by parameters
TS and TH. These parameters define the position and height of portion B within the fields. An example of a
possible area definition within a field can be obtained by reference to figure 10. 

[0065] The resulting accumulated difference values may be shifted to the right in order to reduce hardware complexity
of the further processing stages. It has turned out that at least a 2 bit shift is advantageous for phase indication,
because influences due to Gaussian noise in the video signal are eliminated. The field difference values grow
in proportion to both the pixel data width and the field dimensions. Thus, the corresponding parameter SFD
must be chosen to suit these proportionality constants. The shift operation is shown in equation (2).
\[ FD_{01} = \Bigl(\sum_{i=1}^{n} APD_i\Bigr) \gg (2 + SFD) \qquad (2) \]

wherein FD01 represents a field difference between a current field F0 and a preceding field F1, and n equals the
number of active pixels in the field. The field difference unit outputs both accumulated field difference values FD01A,
FD01B which are calculated with respect to fields of F0 and F1. In addition, the preceding field difference values FD12A,
FD12B are supplied from the field difference unit 201 to the following processing stage, namely the sequence register
202.
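The accumulation for the two portions A and B followed by the shift of equation (2) might look as follows. This is a sketch under stated assumptions: portion A is taken to be the whole field, portion B is taken to be the `th` rows starting at row `ts` (interpreting TS and TH as top position and height in lines), and the function name `field_differences` is hypothetical.

```python
def field_differences(apd, ts, th, sfd=0):
    """Accumulate absolute pixel differences for portions A and B.

    apd: list of rows of absolute pixel differences for one field pair.
    ts, th: assumed top row and height (in rows) of portion B.
    sfd: additional right shift applied on top of the fixed 2 bit shift.
    Returns (FD_A, FD_B), each shifted right by (2 + sfd) bits.
    """
    total_a = sum(sum(row) for row in apd)               # all rows
    total_b = sum(sum(row) for row in apd[ts:ts + th])   # ticker region only
    shift = 2 + sfd
    return total_a >> shift, total_b >> shift


apd = [[4, 4], [0, 0], [8, 8]]
print(field_differences(apd, ts=2, th=1))         # (6, 4)
print(field_differences(apd, ts=2, th=1, sfd=1))  # (3, 2)
```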

[0066] The field difference unit 201 calculates in step 307 a phase indicator of one bit in order to indicate which
two fields are from the same film frame. A one indicates that fields F1 and F2 introduce a smaller field difference signal
FD12A than the fields F0 and F1 represented by the field difference signal FD01A. A zero determines that F0 and F1 are
more similar and thus could originate from the same film frame. In an alternative embodiment, this information is obtained
from the stored motion indicators.

[0067] Fig. 6 shows a simplified block diagram of a sequence register 202. The sequence register 202 receives four 
field difference values supplied by said field difference unit 201, namely FD01A and FD01B, FD12A and FD12B. These
field difference values relate to field differences of three consecutive fields and two different image portions A and B.
Preferably, the field difference values relating to the same fields are combined before detecting motion information. 
The combination procedure will be described below in more detail. 

[0068] In general, the motion bit is generated based on the value of field differences. Unfortunately, field differences 
are not only due to motion between consecutive fields but may also be due to vertical transitions, e.g. a horizontal dark
line in front of a bright background. Such differences result from a vertical offset between odd and even fields. Thus,
motion detection cannot be based solely on the value of a field difference, as this value also results from vertical transitions
in a single frame.

[0069] The present invention uses an adaptive relative threshold in order to detect a transition between frames.
According to the present invention, a current field difference FD 01 and the preceding field difference value FD 12 are
compared. The preceding field difference value FD 12 is preferably weighted in step 401 by a factor K. Factor K is
preferably larger than 1; for best operation it takes one of the four values 2, 3/2, 4/3 or 5/4. Any smaller or higher
value did not lead to a better result. A field difference is considered to represent motion when the current field
difference FD 01 is equal to or greater than the preceding field difference FD 12 multiplied by factor K. Thus, if K=2,
the current field difference FD 01 has to be at least twice as large as the preceding field difference FD 12 in order to
assume motion between fields F 0 and F 1; otherwise, the fields F 1 and F 2 originate from the same frame. This comparison
procedure is given in equation (3).
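
The relative threshold described above can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `motion_bit` and the use of exact fractions for K are illustrative choices, and the numeric field differences are made-up values.

```python
from fractions import Fraction

# The four K values named in the description.
K_VALUES = (Fraction(2), Fraction(3, 2), Fraction(4, 3), Fraction(5, 4))


def motion_bit(fd_01: int, fd_12: int, k: Fraction = Fraction(2)) -> int:
    """Return 1 when FD 12 * K <= FD 01 (equation (3)), else 0."""
    return 1 if fd_12 * k <= fd_01 else 0


print(motion_bit(fd_01=200, fd_12=100))                 # 1: exactly twice as large
print(motion_bit(fd_01=199, fd_12=100))                 # 0: below the relative threshold
print(motion_bit(fd_01=130, fd_12=100, k=K_VALUES[3]))  # 1: 125 <= 130
```

Because the threshold scales with FD 12, a constant contribution from vertical transitions raises both sides and does not by itself trigger a motion bit. When both values are zero the inequality as written holds; the description does not say how that case is handled.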



[0070] Correspondingly, a motion bit will be supplied to the sequence register. The sequence register 405 is
configured to store the number of motion bits necessary to reliably detect a film scene. It must not be longer than an
average scene, because an interruption in the pattern might then occur and film mode would not be detected. Register
405 is a FIFO buffer (first in first out) with random access.
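
A FIFO with random read access can be modelled in software with a bounded double-ended queue. The sketch below is illustrative only: the class name `SequenceRegister` is hypothetical, and the default length of 40 is taken from the number of motion bits examined in the pattern matching paragraphs further on.

```python
from collections import deque

REGISTER_LENGTH = 40  # number of motion bits examined in the description


class SequenceRegister:
    """FIFO of motion bits with random read access (illustrative only)."""

    def __init__(self, length: int = REGISTER_LENGTH) -> None:
        self._bits = deque(maxlen=length)  # oldest bit is dropped automatically

    def push(self, bit: int) -> None:
        self._bits.append(bit)

    def __getitem__(self, position: int) -> int:
        return self._bits[position]  # position 0 is the oldest stored bit

    def as_string(self) -> str:
        return "".join(str(b) for b in self._bits)


reg = SequenceRegister(length=4)
for bit in (1, 0, 1, 0, 0):
    reg.push(bit)
print(reg.as_string())  # "0100": the first bit has been shifted out
print(reg[0])           # 0
```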

[0071] As motion detection is based on a particular amount of difference between consecutive fields, it may happen
that motion detection fails between fields having no or only little motion in between. Such an erroneously undetected film
mode does not affect image quality, and the encoding of fields having almost no motion is not adversely affected when
they are assembled into frames.

[0072] Currently broadcast television programs occasionally comprise a ticker inserted within the lower third of the
screen. As the main portion of each field may be in film mode, film mode will be detected. When the ticker insertion is
in camera mode, a scan conversion or upconversion from interlaced to e.g. progressive (i.e. a sequence of complete frames)
would result in deteriorated, hardly readable text of the ticker, and the encoding efficiency is correspondingly affected.
In such a case, namely when video data in camera mode is inserted into a field generally being in film mode, "camera
mode" has to be detected in order to enable an efficient encoding.

[0073] For that purpose, the field difference unit 201 additionally calculates (at least) a second field difference value
for a second portion B of pixels within each field. Two parameters define the second portion of pixels, namely TS and
TH. Parameter TS indicates the vertical position of the top of the second portion B with respect to the top of the fields.



FD 12 * K ≤ FD 01    (3)
The second parameter TH indicates the height of the second portion with respect to the remaining height of the field,
i.e. the distance between the top of the second portion B and the bottom of the fields. This is shown in fig. 10. Within
a field 801, parameters TS and TH give the vertical position and size of the second portion B. When using these
predefined values for TS and TH, only two-bit data have to be used for the second portion B, resulting in a simple
hardware configuration. In a further preferred embodiment, the horizontal size of portion B may be varied.
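
How two field difference values could be accumulated for one pair of fields is sketched below in Python. Several points are assumptions made only for illustration: the function name `field_differences`, the representation of TS and TH as fractions between 0 and 1, and the choice of the whole field as portion A. The description itself only states that TS positions the top of portion B and that TH sets its height relative to the remaining field height.

```python
def field_differences(cur, prev, ts, th):
    """Return (fd_a, fd_b) for two equally sized fields given as lists of pixel rows."""
    height = len(cur)
    top_b = int(ts * height)                       # first row of portion B
    bottom_b = top_b + int(th * (height - top_b))  # one past the last row of portion B
    fd_a = fd_b = 0
    for y, (row_c, row_p) in enumerate(zip(cur, prev)):
        # Sum of absolute luminance differences of spatially corresponding pixels.
        row_sum = sum(abs(c - p) for c, p in zip(row_c, row_p))
        fd_a += row_sum          # portion A: whole field (assumption)
        if top_b <= y < bottom_b:
            fd_b += row_sum      # portion B: rows selected by TS and TH
    return fd_a, fd_b


prev = [[10, 10], [10, 10], [10, 10], [10, 10]]
cur = [[10, 10], [10, 12], [10, 10], [40, 50]]
print(field_differences(cur, prev, ts=0.5, th=1.0))  # (72, 70)
```

In this toy example almost all of the change sits in the bottom row, so portion B captures it nearly completely, as a ticker in the lower part of the picture would.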
[0074] The field difference unit 201 outputs four field difference values FD 01A, FD 01B, FD 12A, FD 12B. The field
difference values FD 01B, FD 12B resulting from the second portion are weighted in order to adjust the influence of the
second portion on the motion detection result. For such a weighting procedure, the field difference values for portion B
are shifted to the left in step 402. The amount of the left shift of the field difference values is set by parameter TW,
which must be chosen adaptively to the overall field motion. Only this leads to a result that is accurate for all scenes.
The comparison procedure carried out by motion detection unit 404 is given by equation (4).

K * (FD 12A + FD 12B << TW) ≤ (FD 01A + FD 01B << TW)    (4)
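
The weighted comparison of equation (4) can be tried out with the following Python sketch. The function name `combined_motion_bit` and all numeric values are illustrative assumptions. Following the text, the left shift is applied to the portion B values only, so explicit parentheses are used, because in Python (as in C) `+` binds more tightly than `<<`.

```python
from fractions import Fraction


def combined_motion_bit(fd_01a, fd_01b, fd_12a, fd_12b, k=Fraction(2), tw=1):
    """Evaluate K * (FD 12A + (FD 12B << TW)) <= FD 01A + (FD 01B << TW)."""
    current = fd_01a + (fd_01b << tw)    # portion B weighted by 2**TW
    preceding = fd_12a + (fd_12b << tw)
    return 1 if k * preceding <= current else 0


# Main picture unchanged relative to the preceding difference, ticker region changing.
print(combined_motion_bit(1000, 400, 1000, 20, tw=0))  # 0: portion B has too little weight
print(combined_motion_bit(1000, 400, 1000, 20, tw=3))  # 1: portion B now dominates
```

A larger TW lets motion in the small portion B override a main picture that looks like film, which is the behaviour the ticker example calls for.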

[0075] Fig. 7 represents a flow-chart of those steps carried out during pattern matching. In order to detect film mode 
in a sequence of motion bits which are provided by sequence register 202, a particular pattern of motion bits has to 
be detected. 

[0076] Motion picture conversion according to 2-2 pulldown and 3-2 pulldown results in a characteristic pattern of
stored motion bits. When an interlaced video signal, in particular a television signal, is generated by a 2-2 pulldown,
FIFO 405 stores a repeated sequence of bits "10". In case of an NTSC video signal generated by the 3-2 pulldown
technique, a bit sequence of "10010" occurs repetitively in register 405 when read from right to left. In the following
paragraphs these two bit sequences are referred to as reference patterns. One of the two may be selected depending on
a parameter FORMAT (0=PAL, 1=NTSC).
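
Why these two bit sequences arise can be seen from a small Python sketch that generates ideal motion bits for a pulldown cadence. The function name `pulldown_motion_bits` and the dictionary `REFERENCE_PATTERN` are hypothetical, the bits are listed in chronological order, and the reading direction of register 405 is not modelled.

```python
from itertools import cycle

# Reference patterns selected by parameter FORMAT (0=PAL, 1=NTSC).
REFERENCE_PATTERN = {0: "10", 1: "10010"}


def pulldown_motion_bits(repeats, n_frames):
    """Ideal motion bits when each film frame is shown for `repeats` fields in turn."""
    bits = []
    for _, count in zip(range(n_frames), cycle(repeats)):
        bits.append("1")                 # first field of a new film frame: motion
        bits.extend("0" * (count - 1))   # repeated fields of the same frame: no motion
    return "".join(bits)


print(pulldown_motion_bits((2, 2), 3))  # 101010
print(pulldown_motion_bits((3, 2), 4))  # 1001010010
```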

[0077] A simple algorithm of pattern matching may detect one complete pattern in the motion bit sequence and
correspondingly indicate film mode when detected. In order to provide a more reliable film mode indication, the present
invention examines forty motion bits. Such a more complex matching procedure allows detection error concealment
algorithms to be implemented.

[0078] According to the present invention, pattern matching generates four different matching result registers (full_pat,
beg_pat, end_pat, no_pat). Three of them indicate a matching result of the selected reference pattern with the stored
data, namely full_pat for a complete pattern match, beg_pat for the pattern running into the register and end_pat
for the pattern running out of the register. Finally, it is detected whether no pattern or a destroyed pattern is present
(no_pat).

[0079] The search procedure will be described in detail with reference to figure 7. Index i is provided in order to select
all register positions 1 to 40 of register 405. For pattern detection, the following steps are carried out.

[0080] First, index i is incremented by the length of the reference pattern, i.e. by 2 for a PAL television signal and by
5 for an NTSC television signal.

[0081] Second, in case the reference pattern is completely detected for the corresponding register positions, register
full_pat is incremented by 1. An example for this is given in fig. 8a and fig. 8b for a PAL television signal at register
positions 2 and 4. A corresponding example is given for an NTSC television signal in fig. 9a and fig. 9b.

[0082] In addition, a beginning pattern is searched for when, at the beginning of the register, fewer bits remain than
make up a complete pattern. Fig. 8c, fig. 9c and fig. 9d give examples of a detected "beginning pattern" preceding an
already detected pattern for the immediately consecutive motion bit positions. When a partial pattern is found at the
beginning of the register, which is in positional conformity with the detected complete pattern, register beg_pat is
incremented by 1. The same situation applies to fig. 8d and fig. 9e and f. The situations correspond to those of fig. 8c
and fig. 9c and d. The only difference is that a first complete pattern could not be detected due to motion detection
errors.

[0083] Finally, an ending pattern is shown in fig. 8e and fig. 9g and h. Register end_pat is incremented in situations 
corresponding to those of the beg_pat register. 

[0084] In case no complete pattern has been detected, index i and register no_pat are incremented by one. Then, 
the processing returns to the first step. 

[0085] Next, the conditions for signalling a mode switch are checked and, when the conditions are met or i = 40, the 
search is stopped. 
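
The search loop of paragraphs [0079] to [0085] can be sketched roughly as follows in Python. This is a simplified reading of the text and not the flow chart of fig. 7: the function name `match_patterns`, the left-to-right scan over a bit string, the way partial patterns at the two ends are recognised, and the omission of the early stop on a mode switch are all assumptions made for illustration.

```python
def match_patterns(bits: str, pattern: str) -> dict:
    """Count complete, partial (both ends) and missing patterns in a motion-bit string."""
    n, length = len(bits), len(pattern)
    counts = {"full_pat": 0, "beg_pat": 0, "end_pat": 0, "no_pat": 0}
    i = 0
    while i < n:
        if bits[i:i + length] == pattern:
            # A short run before the first complete pattern counts as a beginning
            # pattern if it equals the tail of the reference pattern.
            if counts["full_pat"] == 0 and 0 < i < length and pattern.endswith(bits[:i]):
                counts["beg_pat"] += 1
                counts["no_pat"] -= i  # those leading bits were provisionally counted
            counts["full_pat"] += 1
            i += length                # step over the complete pattern
        elif n - i < length and pattern.startswith(bits[i:]):
            counts["end_pat"] += 1     # partial pattern running out of the register
            break
        else:
            counts["no_pat"] += 1      # no complete pattern at this position
            i += 1
    return counts


print(match_patterns("10010" * 8, "10010"))
print(match_patterns("10" + "10010" * 2 + "100", "10010"))
```

With forty clean 3-2 pulldown bits the first call counts eight complete patterns; in the second call the fifteen-bit string yields two complete patterns plus one partial pattern at each end.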

[0086] The two existing reference patterns described for a conversion into PAL and NTSC interlaced video sequences 
may be replaced by new reference patterns relating to e.g. computer generated sequences. 

[0087] Referring to fig. 4, the operation of mode switch 204 is described. Mode switch 204 receives count values for
each pattern register, namely the registers full_pat, beg_pat, end_pat and no_pat. In addition, the mode switch receives
parameters defining switching conditions in order to provide a more reliable film mode indication. These parameters,
CDLY and FDLY, determine when to switch from film mode indication to camera mode indication and vice versa in
order to effect a hysteresis behaviour. Such a hysteresis behaviour smooths the switching between film and camera
mode, resulting in a less obvious change of image quality.

[0088] CDLY determines the number of fields which do not conform to a film mode pattern before the mode switch
returns to camera mode. A single pattern not being equal to a film mode pattern should not initiate a return to camera
mode. Depending on the image content and quality, a number of 2 to 5 consecutive undetected film mode patterns
provides good results with respect to image quality improvements.

[0089] The second parameter FDLY determines the number of consecutive fields which have to be in conformity
with a reference pattern before mode switch 204 outputs a film mode indication. For a switch to film mode, not even a
single pattern deviating from a film mode pattern should be admitted.

[0090] These switching conditions are given by the following formulas, wherein equation (5) relates to a switch from
film to camera mode and equation (6) relates to a switch from camera into film mode.

(no_pat > CDLY) && !beg_pat    (5)

full_pat * (2 for PAL / 5 for NTSC) + beg_pat + end_pat > FDLY    (6)
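
The hysteresis defined by the two switching conditions can be sketched in Python as follows. The function name `update_mode`, the dictionary `counts` holding the four register values, and the example values CDLY=3 and FDLY=30 are assumptions chosen for illustration; the description only names a range of 2 to 5 for the number of undetected patterns.

```python
def update_mode(film_mode, counts, pattern_length, cdly, fdly):
    """Return the new mode (True = film, False = camera) from conditions (5) and (6)."""
    if film_mode:
        # Equation (5): switch from film to camera mode.
        if counts["no_pat"] > cdly and not counts["beg_pat"]:
            return False
    else:
        # Equation (6): switch from camera to film mode.
        matched = (counts["full_pat"] * pattern_length
                   + counts["beg_pat"] + counts["end_pat"])
        if matched > fdly:
            return True
    return film_mode  # otherwise keep the current indication


clean = {"full_pat": 8, "beg_pat": 0, "end_pat": 0, "no_pat": 0}
noisy = {"full_pat": 2, "beg_pat": 0, "end_pat": 0, "no_pat": 4}
print(update_mode(False, clean, pattern_length=5, cdly=3, fdly=30))  # True
print(update_mode(True, noisy, pattern_length=5, cdly=3, fdly=30))   # False
```

Since entering film mode and leaving it are tested by different conditions, a few isolated detection errors change the indication in neither direction.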


[0091] Summarising, the present invention provides a considerable increase in accuracy of film mode detection and 
thus coding efficiency by taking an additional image portion of the fields into account when detecting motion between 
fields of a video sequence. 


Claims 

1. A method for encoding a sequence of video fields, comprising the steps of:

receiving a sequence of video fields to be encoded,

calculating (301) absolute pixel differences between spatially corresponding pixels of a current and a previous
field,

accumulating (304A, 304B) said absolute pixel differences for a first predefined portion and a second predefined
portion of a field and providing a first and a second field difference value (FD A, FD B),

determining (404) whether or not motion is present between a current and a previous field based on the first
and second field difference values (FD A, FD B) and generating a motion indicator depending on the determination
result,

indicating film mode when a predefined pattern is detected within a sequence of motion indicators,

re-assembling the video data of two consecutive fields originating from the same film frame into frame based
video data when film mode is detected for these fields,

encoding said re-assembled frame based data and

outputting the encoded video data.


2. A method according to claim 1 wherein said second portion of a field is adjustable in size and/or position within a 
field. 

3. A method according to claim 1 or 2 wherein motion between two consecutive fields is determined by comparing
a field difference value accumulated for a current field with a predetermined threshold value and motion is determined
when said field difference value of the current field is larger than said threshold value.

4. A method according to claim 3, wherein said threshold value is a field difference value accumulated for the
previous field.

5. A method according to claim 3 or 4 wherein the first and second field difference values of a field are accumulated 
prior to said comparing step. 

6. A method according to any of claims 3 to 5, wherein said threshold value is multiplied (401) by a factor (K) prior 
to said comparing step, said factor (K) being larger than 1. 

7. A method according to claim 5 or 6, wherein said second field difference value (FD B) is multiplied (402) by a
weighting factor (TW) prior to being accumulated with said first field difference value (FD A), said factor (TW) having
a value larger than 1.

8. A method according to any of claims 1 to 7, wherein said film mode detecting step comprises the steps of: 

storing motion indicators determined for a sequence of consecutive fields, 
detecting said predefined pattern within said sequence of stored motion indicators, 

indicating film mode when said predefined pattern is detected within said sequence of stored motion indicators. 

9. A method according to any of claims 1 to 8, wherein said predefined pattern is a pull-down pattern for the
conversion of film frames into video fields.

10. A method according to any of claims 1 to 9, wherein an indication of film mode is terminated after said predefined 
pattern is not detected for a predefined number of consecutive fields. 

11. A method according to any of claims 1 to 10 wherein said pixel differences are calculated based on luminance 
information of the pixels. 

12. A method according to any of claims 1 to 11 wherein an indication which two of three consecutive fields originate
from the same film frame is generated based on the first field difference values (FD 01A, FD 12A), wherein those two
fields are indicated to originate from the same film frame having the smaller first field difference value (FD 01A, FD 12A).

13. A method according to any of claims 1 to 12 wherein the encoding step subjects the two-dimensional video data
to an orthogonal transform in order to transform the video data into transform coefficients in the frequency domain.

14. An encoding apparatus for encoding a sequence of video fields, comprising: 

a subtractor (301) for calculating pixel differences between spatially corresponding pixels of a current and a 
previous field, 

a rectifier (301) for receiving said calculated pixel differences and providing absolute pixel differences,

a first and a second accumulator (304A, 304B) for accumulating said absolute pixel differences received from
said rectifier for a first and a second portion within a field and providing a first and a second field difference value,

a motion detection unit (202) for determining whether or not motion is present between a current and a previous 
field based on the field difference values and generating a corresponding motion indicator, 

a pattern matching unit (203) for comparing a predefined pattern with a sequence of generated motion indi- 
cators, 

a mode switch (204) for generating a film mode indication based on a detection of said predefined pattern 
within said motion indicators, 

a re-assembling unit for re-assembling video data of two consecutive fields originating from the same film 
frame into frame based video data when film mode is detected for these fields and 

an encoding unit for encoding the video data of said re-assembled frame based video data.

15. An encoding apparatus according to claim 14, further comprising offset control means, adjusting the position of
said second portion within a field and/or size control means, adjusting the size of said second portion in vertical
and/or horizontal direction.

16. An encoding apparatus according to claim 14 or 15 further comprising a comparator (404) for comparing a field
difference value accumulated for a current field with a predetermined threshold value, motion being determined
when said field difference value of the current field is larger than said threshold value.


17. An encoding apparatus according to claim 16, wherein a field difference value accumulated for the previous field
is applied to said comparator (404) as said threshold value.

18. An encoding apparatus according to claim 16 or 17 further comprising adding means (403) for adding the first
and the second field difference values of a field before providing the sum of both field difference values to said
comparator (404).

19. An encoding apparatus according to any of claims 16 to 18, further comprising a multiplier (401) for multiplying
the field difference value of the previous field by a factor (K) and providing the result to said comparator (404),
said factor (K) being larger than 1.

20. An encoding apparatus according to claim 18 or 19, further comprising a multiplier (402) for multiplying said
second field difference value by a weighting factor (TW) and providing the multiplication result to said adder (403),
said factor (TW) having a value larger than 1.


21. An encoding apparatus according to any of claims 14 to 20, further comprising a memory for storing a pull-down 
pattern of a particular conversion of film frames into video fields, said stored pull-down pattern being applied to 
said pattern matching unit (203) as said predefined pattern. 

22. An encoding apparatus according to any of claims 14 to 21, wherein

said pattern matching unit (203) comprises a register (no_pat) storing a count value which is incremented
based on the comparison result each time said predefined pattern is not detected within said sequence of
motion indicators,

said mode switch (204) comprises a comparator for detecting whether or not the count value provided from
said register (no_pat) exceeds a predetermined threshold value (CDLY) and

said mode switch (204) maintains the film mode indication based on the comparison result as long as said
count value does not exceed the predetermined threshold value (CDLY).

23. An encoding apparatus according to any of claims 14 to 22, further comprising a signal separator to only provide
the luminance information of the pixels to said subtractor (301).

24. An encoding apparatus according to any of claims 14 to 23 further comprising a phase detector (307) receiving
the two first field difference values (FD 01A, FD 12A) calculated with respect to three consecutive fields (F 0, F 1, F 2)
for indicating which two of the three consecutive fields originate from the same film frame based on the first field
difference values, wherein those two fields are indicated to originate from the same film frame having the smaller
field difference value.


25. An encoding apparatus according to any of claims 13 to 22, wherein said encoding unit comprises a transform
unit for performing an orthogonal transform with respect to two-dimensional video data in order to provide transform
coefficients in the frequency domain.
